The data is fetched with “spotifyr”, an R package for the Spotify API. I also use a dataset from lastfm, which has been storing my listening activity since September 2021. Because spotifyr limits the number of requests I can make, I restricted the data to the artists Spotify thinks I listen to most and the artists I think I listen to most. From the lastfm dataset I use only the data gathered in 2023. During exploration I noticed that too many rows do not represent what I actively listen to: I often leave Spotify running in the background, so the dataset contains many songs that were played fewer than 2 times. Keeping the data small also helps because the fetch commands in spotifyr take a long time, so the less I have to fetch the better.
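The restriction described above can be sketched as follows. This is only an illustration under assumptions: the `scrobbles` table, its column names, and its rows are made up, and the cutoff of at least 2 plays is my reading of "listened less than 2 times", not necessarily the exact filter used for the plots.

```r
library(dplyr)

# Hypothetical stand-in for the lastfm export: one row per play.
scrobbles <- tibble(
  artist    = c("Radiohead", "Radiohead", "Radiohead",
                "The Beatles", "The Beatles", "Radiohead"),
  track     = c("Reckoner", "Reckoner", "Nude",
                "Something", "Something", "Reckoner"),
  played_at = as.Date(c("2023-01-05", "2023-03-10", "2023-04-01",
                        "2022-12-31", "2023-02-14", "2023-06-20"))
)

active_tracks <- scrobbles %>%
  filter(format(played_at, "%Y") == "2023") %>%  # keep only plays from 2023
  count(artist, track, name = "plays") %>%       # number of plays per song
  filter(plays >= 2)                             # drop songs played fewer than 2 times

active_tracks
```

Only the songs that survive this filter would then need to be looked up with spotifyr, which keeps the number of requests down.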

The following plots just explain trends I follow when actively listening to music.

Audio Features:

danceability: The danceability feature ranges from 0 to 1, where 0 indicates a track that is not suitable for dancing, and 1 represents a highly danceable track.

energy: The energy feature also ranges from 0 to 1, where 0 indicates low energy or calmness, and 1 represents high energy or intensity.

key: The key feature gives the musical key of the track as an integer from 0 to 11 in pitch class notation (0 = C, 1 = C♯/D♭, 2 = D, and so on).

mode: The mode feature is binary and can take the values 0 or 1, where 0 represents a minor key and 1 represents a major key.

loudness: The loudness feature is expressed in decibels (dB) and typically ranges from -60 dB to 0 dB. Higher values indicate louder tracks, whereas lower values indicate quieter tracks.

speechiness: The speechiness feature ranges from 0 to 1, where values near 0 indicate a track with little or no spoken content (music, including sung vocals), and values near 1 indicate a track that consists almost entirely of spoken words.

acousticness: The acousticness feature also ranges from 0 to 1, where 0 indicates a track that is not acoustic or contains minimal acoustic elements, and 1 represents a track that is purely acoustic.

tempo: The tempo feature is the estimated tempo in beats per minute (BPM). It can range from very low values (e.g., 40 BPM) to very high values (e.g., 200 BPM or more).
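Since key and mode are stored as numbers, a small helper can make them readable in plots and tables. The sketch below is an illustration only: `key_label()` and `pitch_classes` are hypothetical names, not part of spotifyr, and the sharp-only spelling of the note names is an arbitrary choice. It also assumes that a key outside 0 to 11 (the Spotify API reports -1 when no key was detected) should become `NA`.

```r
# Pitch class names for key values 0..11 (0 = C, 1 = C#/Db, ...).
pitch_classes <- c("C", "C#", "D", "D#", "E", "F",
                   "F#", "G", "G#", "A", "A#", "B")

# Hypothetical helper: turn numeric key and mode into a label such as "A minor".
key_label <- function(key, mode) {
  valid <- key %in% 0:11
  note <- rep(NA_character_, length(key))
  # R vectors are 1-indexed, so shift the 0-based key by one.
  note[valid] <- pitch_classes[key[valid] + 1]
  ifelse(valid,
         paste(note, ifelse(mode == 1, "major", "minor")),
         NA_character_)
}

labels <- key_label(key = c(0, 9, -1), mode = c(1, 0, 1))
labels
```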

## `summarise()` has grouped output by 'artist_popularity'. You can override using
## the `.groups` argument.
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

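Spotify's metadata has a limitation: genres are stored per artist, not per song. Every song therefore inherits all the genres of its artist, and an artist with several genres takes up several rows, as the table below shows.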

## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'

## # A tibble: 11 × 2
##    genres           name       
##    <chr>            <chr>      
##  1 british invasion The Beatles
##  2 classic rock     The Beatles
##  3 merseybeat       The Beatles
##  4 psychedelic rock The Beatles
##  5 rock             The Beatles
##  6 alternative rock Radiohead  
##  7 art rock         Radiohead  
##  8 melancholia      Radiohead  
##  9 oxford indie     Radiohead  
## 10 permanent wave   Radiohead  
## 11 rock             Radiohead
## Warning in left_join(audio_features, top_artists, join_by(artist_id == id)): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 367 of `x` matches multiple rows in `y`.
## ℹ Row 1 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
##   "many-to-many"` to silence this warning.
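The warning above comes from joining tracks to a table that has one row per artist and genre: each artist matches many tracks, and each track matches many genre rows. Two possible ways to handle this are sketched below. The table names follow the warning, but the columns and all values are made up for illustration, and which option fits depends on whether the plot needs one row per track or one row per track and genre.

```r
library(dplyr)

# Hypothetical miniature versions of the two tables named in the warning.
top_artists <- tibble(
  id     = c("a1", "a1", "a2", "a2"),
  name   = c("The Beatles", "The Beatles", "Radiohead", "Radiohead"),
  genres = c("rock", "merseybeat", "rock", "art rock")
)
audio_features <- tibble(
  artist_id    = c("a1", "a1", "a2"),
  track_name   = c("Something", "Yesterday", "Reckoner"),
  danceability = c(0.40, 0.33, 0.49)  # made-up values
)

# Option 1: collapse the genres so each artist is a single row before joining.
artists_one_row <- top_artists %>%
  group_by(id, name) %>%
  summarise(genres = paste(genres, collapse = ", "), .groups = "drop")

per_track <- left_join(audio_features, artists_one_row,
                       join_by(artist_id == id))

# Option 2: keep one row per track and genre, and declare that this is intended.
per_track_genre <- left_join(audio_features, top_artists,
                             join_by(artist_id == id),
                             relationship = "many-to-many")
```

With option 1 every track appears once, so per-track averages are not inflated by artists that have many genres. With option 2 each track is repeated once per genre, which is convenient for plots grouped by genre.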